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INTRODUCTION 


This memorandum outlines a completely parallel algorithm for the symmetric 
eigenproblem AX - ABX. The algorithm is parallel in the sense that the 
numerical operations do not occur in a fixed sequence. Therefore, a large 
number of operations can. bo programmed to be performed, concurrently on a 
computer with multiple central processing units. 

The standard symmetric eigenvalue problem AX - AX has the property that 
the n eigenvalues of the principal submatrix of A of order n are 
separated by the (n - 1) eigenvalues of the principal submatrix of order 
(n - 1) . The separation property delineates n intervals containing one 
eigenvalue. Each eigenvalue and corresponding eigenvector can be computed 
independently. The n eigenproblem calculations can be divided among multiple 
processing units. 

A sufficient condition for the separation property to apply to the general 
symmetric problem AX - ABX is presented in this memorandum. The sufficient 
condition is slightly less restrictive than a condition which ensures that all 
the eigenvalues are real, namely, that B be positive definite. 

A parallel algorithm is readily derived for problems that obey the 
separation property. Once the eigenproblem of order n is solved the algorithm 
solves the eigenproblem of order (n + 1) and continues until n - N, the 
order of the square matrices A and B. Only one pass through the principal 
submatrices is required to solve for all the eigenvalues and eigenvectors of 
AX - ABX. 

Intermediate numerical results for eigenvectors must be stored. These 
results can be shared by the central processing units, or the results can be 
partitioned into separate blocks of data accessed by a single processor. 

The algorithm is numerically stable, avoids factoring matrix B, and 
automatically arranges the eigenvalues in nondecreasing order. Multiple 
eigenvalues and closely spaced eigenvalues do not present any numerical 
difficulties . 

The separation property is regarded in the literature as a theoretical 
property rather than the basis for a practical algorithm. The reason is the 
large number of matrix-vector multiplications required by the algorithm. The 

total muraber of scalar multiplications is on the order of N /4 compared to 
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the comparable number of 5N /3 for the Householder method for the standard 
problem. However, the input-output operations are straightforward for the 
parallel method so that the total computer cost is not necessarily as 
prohibitive as indicated by the count of scalar multiplications. 

The body of the memorandum outlines the derivation of the separation 
property for the general symmetric problem AX - ABX. The steps in the 
algorithm are next written explicitly. Finally, some of the numerical analysis 
for avoiding round-off errors in enforcing the separation property in a code 
for parallel processing is discussed. Special cases of uncoupled modes and 
multiple eigenvalues are included in the discussion. 
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SEPARATION PROPERTY 

The separation property follows from examining the characteristic equation 
for the bordered eigenvalue problem of order n in the form 
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The are eigenvalues of the problem A^ n '^ X - AB^ n ’^X where the 

superscripts in parentheses denote the order of a principal submatrix. The 

bordered form in equation (1) follows from a similarity transformation using 

/ „ _ 1 \ 

IT , the matrix of orthonormal eigenvectors corresponding to the eigenvalues 


A complete proof of the separation property for the standard eigenvalue 
problem is derived in ref. (1). For the standard problem, the coefficients d. 

vanish and b is unity. The effect of these terms on the proof of the 

separation property is shown here. 

The characteristic equation for the matrix of coefficients in equation (1) 
can be written as 
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The separation property is that the roots of equation (2) satisfy the 
conditions 
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where the /k are arranged in nondecreasing order. 
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A sufficient condition for the separation property to hold is derived by 
showing that f(A) has only one real root between consecutive values of the 
as indicated schematically in Fig. 1. Since f(A) goes from positive 

infinity to negative infinity, f(A) has only one root in each interval if the 
slope of f (A) never changes sign and is always negative. 
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Therefore, a sufficient condition for the separation property to hold is that 
the right-hand side of equation (5) is negative. By dropping the positive 
quadratic terms inside the square brackets, a more restrictive condition is 
determined that holds independent of the value of A. 
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( 6 ) 


However, the term A^ is a factor in a product equal to the determinant of 

matrix B^ n ^ and condition (6) holds when matrix B is positive definite, a 
sufficient condition for all of the eigenvalues of AX - ABX to be real. 

The result in equation (5) is not a complete proof of the separation 
property, conditions (3). The special cases of multiplicity of equal 
eigenvalues and of uncoupled modes where c^ - d^ - 0 must be considered, 

but these cases are included in the discussion in ref. (1) for the standard 
symmetric eigenvalue problem and are omitted here. The special cases are 
included in the algorithm presented in the next section. 

The significant result for parallel processing is that the separation 
property allows concurrent computation of the roots of the characteristic 
equation, equation (2) , and of the corresponding eigenvectors from equations 
(1) . The concurrent computation will be examined further after outlining the 
complete parallel algorithm for the problem AX - ABX. 


COMPLETE PARALLEL ALGORITHM 


The concurrent solution of the characteristic equation f(A) - 0, equation 
(2), is one step in the parallel algorithm. The steps in solving the nth order 

problem A^X - AB^X are the following: 

1. Given ft the eigenvalues of A^ n ^X - AB^ n ^X, and the corresponding 

matrix of eigenvectors U v ‘ ' , compute the bordered form, equations 
(1), of order n. 
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2. Solve the characteristic equation for the bordered form, equation (1) , 
for the eigenvalues Aj . 
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3. Compute the elements of the matrix of orthonormal eigenvectors for the 
bordered problem, . 


V ij ’ S ij V nj 

where 
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The nth element of each eigenvector, v n j > determined by normalizing 
the orthogonality relation containing B as a weighting matrix, 
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4. Compute the matrix of orthonormal eigenvectors, U v ' , for the problem 
A (n) X - A B (n) X. 


(n) ”' 1 (n-1) 

U ij “ ^ U ik V kj 
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The complete algorithm updates the order n to (n + 1) , “ Aj 11 ^ > 

and repeats the four steps above until n - N, the order of the original 
problem AX - ABX. This completes the algorithm. The problem is solved for 
all the eigenvalues and corresponding eigenvectors with the eigenvalues 
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automatically arranged in nondecreasing order. The eigenvectors are 
orthonormal with B as a weighting matrix, 

(U< N) ) T B - I 
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All the computations in the four steps with subscript j are independent 
and can be programmed for computation in parallel. The algorithm is almost 
doubly parallel since the row operations with subscript i are also 
independent for each value of i. Any parallel processing with i subscripts 

depends on the data storage of the matrices U^ n '^ and and on their 

accessability by the different central processing units. The possibility of 
parallel processing with the row operations on subscript i is merely noted 
here; the rest of this memorandum examines the parallel operations with the j 
subscript. 


NUMERICAL STABILITY 


Separation Property 

The algorithm summarized by the four steps in the preceding section is 
parallel in theory. In practice, the linear independence of the computed 
intermediate eigenvectors must be preserved without too much round-off error. 
The linear independence is achieved by enforcing the separation property, 
computing accurate roots of the characteristic equation, equations (2), and 
avoiding zero or small divisors in the eigenvector computation, equations (9) . 


In programming the parallel algorithm, the separation property reduces to 
programming each central processing unit to solve equation (8) for a distinct 
Aj such that 

< 5 - 1 * ( 12 ) 


The characteristic equation, f ( Aj ) - 0, is nonlinear and must be solved 

iteratively. An iteration sequence that singles out the root that satisfies 
condition (12) is based on finding roots of another function, G(A), 


G(A) - (Mj^ - A)( Mj - A) f (A) - 0 (13) 

The function G(A) can be written as the difference of two functions, 

G(A) - g x (A) - g 2 (A) (14) 

The first function is a cubic in A, 

g l (A) “ (/i j-l ’ A)(M j * A)(a nn * A b nn > " 

A) (c. - ^ j ) 2 -(M j * A) ( Cj ! - A dj^) 2 (15) 


The second function is 
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If the inequality (6) is satisfied, the cubic g^(A) is guaranteed to have 

three real roots. At least one of the three roots, A - A, satisfies condition 
(12), as shown schematically in Fig. 2. The figure also shows the function 
g£(A) and A ^ , the exact root of G(A) that satisfies conditions (12). The 

zeroth approximation A for A^ can be improved by solving the cubic equation 
in AA generated by replacing A with 


A - A + AA (17) 

in G(A) and retaining up through linear terms in AA in the Taylor's series 
for the terms in brackets in g£(A). The iteration can be continued by 

updating A with A from equation (17) until AA approaches zero. 


Checking The Separation Property 

At some point before the end of step 4, the results from the different 
central processing units for the eigenvalues A^ must be checked for special 

cases. There are two types of special cases. 

The first special case is uncoupled modes where 
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in conditions (3) must be 
A I+1 “ A l“ 


The second special case for the numerical analysis is multiple values of the 

When fij 1 ” “ #*» then Aj - ft. The iterative solution for G(A) must 

recognize the correct order of the roots, and also avoid the apparent division 
by zero in equation (16) when the number of multiple values of exceeds 

two. Both special cases must be flagged at some point to prevent division by 
zero in computing the eigenvectors of the bordered problem, equations (9). 


Further Check on the Separation Property 

Before examining the eigenvector calculation for the special cases, two 
overall checks on the solutions of the characteristic equations are listed 
here. The sum and the product of the roots can be monitored at each value of 
n. Given the sum and the product of the eigenvalues for the problem of order 
(n-1), 
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an independent check is available for the sum and product of the next order 
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The checks on equations (19) follow from clearing the characteristic equation, 
f (A) — 0, of fractions and writing it as an nth degree polynomial. The 
independent checks are then 


n-1 

S(n) - S(n-l) + t a nn + 2 d i (/x 1 d i - 2 c^]/^ 


(20a) 


n-1 

P(n) - -P(n-l) [a - 2 c./fi.]/A 

v 1 nn i"i J ' n 


(20b) 


where is defined in equation (6) . 


Linear Independence of Eigenvectors 

Two special cases were identified above that must be flagged to prevent 
division by zero in computing eigenvectors of the bordered problem. To avoid 
round-off errors in the eigenvector computation, a second subscript I(j) can 
be defined that is related to every eigenvalue A^ by the condition that the 

absolute value of (/i^-A^) is a minimum for the set of values (/x^-Aj). The 

quantity (/i^- A^) appears in the iterative solution of G(A^) - 0 and can be 

computed at that time rather than performing the indicated subtraction when A^ 

is nearly equal to Then, if (/j^-A^) is less than (c^- *- n 

absolute value, an alternate form of equations (9) is preferable for the 
eignvector computation, 
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The special case of an uncoupled mode with - d^ - 0 is a limiting case of 
equations (21) with - 1 and the remaining - 0. The second special 

case of multiple eigenvalues /ij ^ is also a limiting case of 

equations (21) where equation (21c) takes the form 
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Therefore, if the I(j) subscripts are set properly and the quantities 
(jtj '•Vj) are computed precisely, the eigenvectors will be linearly 

independent. The matrix of eigenvectors \P ; will be orthonormal without the 
need for checking linear independence. The accurate computation of the 
quantities - A.) is discussed further in the appendix. 


CONCLUDING REMARKS 

A completely parallel algorithm has been presented for the symmetric 
eigenproblem AX - ABX. The algorithm is straightforward and can serve as a 
benchmark problem for testing the efficiency of concurrent central processing 
units . 

The operation count for the algorithm is relatively high, but the storage 
reqirements and input-output operations are well-ordered. The total computer 
cost for the algorithm may prove to be competitive with more conventional 
algorithms . 

This memorandum outlines the basic algorithm for dense matrices. The 
algorithm can be adapted to banded matrices. It can also be modified for 
constrained problems where the B matrix is singular. An example of the 
latter problem arises when Lagrangian multipliers are introduced to constrain 
vibration modes. The formulation with multipliers retains symmetry, but results 
in some zero elements on the diagonal of the B matrix. The zero elements act 
to alter the separation property, but the algorithm remains parallel. 


APPENDIX 

Computing Small Roots of Cubic Equations 

The body of this memorandum outlines the solution of G(A) , equation (14), 
by an iterative sequence of cubic equations. As the iteration converges, one 
root of the cubic equations approaches zero. To prevent round-off errors 
affecting the final result of the iteration, the standard algorithm for real 
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roots of cubic equations should be slightly modified. The standard algorithm 
has a change of variables that can be the source of round-off error. 

Setting up the iterative sequence also involves a change of variables 

indicated by A - A + AA in equation (17). For later use in computing 
eigenvectors, the change in variables should also explicitly include two other 
expressions 


* A) - - A) + AA (Ala) 

(Mj - A) - (^ - A) + AA (Alb) 


As A is updated by AA to approach A^ at each iteration step, the 
quantities in equations (Al) are updated in a similar fashion. 


The final accuracy of the iteration depends on computing small roots of a 
sequence of cubic equations of the form 


3 2 

e^x + e£x + e£x + e^ - 0 (A2) 

where x - AA. At each iteration step, an accurate solution of equation (A2) 
is required for the smallest root in absolute value or else for the root that 

gives a current value of A^ - A that satisfies conditon (12). 

The standard algorithm, ref. 2, divides equation (A2) by the coefficient 
e^ to reduce it to the form 

3 2 

x + e^x + e^x + e Q - 0 (A3) 

Next, a change of variables is introduced, 

y - x + (e 2 /3) (A4) 

The change of variables transforms the cubic in x to the canonical form 

y 3 + 3py + 2q - 0 (A5) 


where 

p - -(e 2 /3) 2 + (ej/3) 


q - (e 2 /3) 3 - ( 6l /2)(e 2 /3) + (* o /2) 

The standard algorithm for solving equation (A5) computes a discriminant D 
whose sign determines the number of real roots, 


rx 3 J 2 
D - p + q 


(A6) 
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However, when e Q is small compared to e 2> less round-off error results from 
computing D from an equivalent expression 

D- (e 2 /3) 2 [e o (e 2 /3) - (e 2 /12)] + (e Q /2) [ (e Q /2) - 6^/3) ] (A7) 

When the discriminant D is negative, the cubic equation has real roots. The 
three real roots of the cubic in x are 


x^ - -(e 2 /3) - 2r cos a 
x 2 - [-(e 2 /3) + r cos a] 

X 3 “ l ' + r cos “3 

where 
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and the sign of each square root is positive. 

When the discriminent D is positive, the cubic equation has two complex 
roots and one real root, 


X 1 " * (e 2 /3) + U + V 

(A9a) 

x 2 - -(e 2 /3) - (U+V)/2 + i (3 1/2 /2)(U-V) 

(A9b) 

x 3 - -(e 2 /3) - (U+V)/2 - i (3 1/2 /2)(U-V) 

(A9c) 

where 


U - [-q + (D) 1/2 ] 1/3 , V - [-q - (D) 1/2 ] 1/3 , and the signs 
are chosen so that UV - -p. 

on the cube roots 


Theoretically, the iteration sequence for computing the 


looks for 


real roots, x - AA, from the cubic equations. However, round-off error can 
return a small positive value for D when D is near zero. When D is 
exactly zero, the cubic has at least two equal roots. The iteration sequence 
searching for real roots can be continued for random small positive D by 
simply dropping the small imaginary parts of the complex roots. 


The iteration sequence converges when e Q - 0 in equation (A3) . If the 

assumption is made that the computation for e^ contains less round-off error 

than the round-off error introduced by the change in variables in equation 
(A4) , then the smallest root of the cubic in x can also be computed from 


11 


X 1 - AA - -e o /(x 2 x 3 ) (AlO) 

where the x^ from equations (A8) have been reorderd in nondecreasing absolute 
value . 
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Figure 2. Schematic diagram showing roots of the cubic function 
of g^X), zeros of g2*^’ and where 8^) = g 2 ^)* 
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